Web Entity Resolution Algorithm Based on Schema-Aware Meta-Blocking Technology
نویسندگان
چکیده
منابع مشابه
BLAST: a Loosely Schema-aware Meta-blocking Approach for Entity Resolution
Identifying records that refer to the same entity is a fundamental step for data integration. Since it is prohibitively expensive to compare every pair of records, blocking techniques are typically employed to reduce the complexity of this task. These techniques partition records into blocks and limit the comparison to records co-occurring in a block. Generally, to deal with highly heterogeneou...
متن کاملMFIBlocks: An effective blocking algorithm for entity resolution
Entity resolution is the process of discovering groups of tuples that correspond to the same real-world entity. Blocking algorithms separate tuples into blocks that are likely to contain matching pairs. Tuning is a major challenge in the blocking process and in particular, high expertise is needed in contemporary blocking algorithms to construct a blocking key, based on which tuples are assigne...
متن کاملA generic Web-based entity resolution framework
Web data repositories usually contain references to thousands of real-world entities from multiple sources. It is not uncommon that multiple entities share the same label (polysemes) and that distinct label variations are associated with the same entity (synonyms), which frequently leads to ambiguous interpretations. Further, spelling variants, acronyms, abbreviated forms, and misspellings comp...
متن کاملSimplifying Entity Resolution on Web Data with Schema-agnostic, Non-iterative Matching
Entity Resolution (ER) aims to identify different descriptions in various Knowledge Bases (KBs) that refer to the same entity. ER is challenged by the Variety, Volume and Veracity of descriptions published in the Web of Data. To address them, we propose the MinoanER framework that fulfills full automation and support of highly heterogeneous entities. MinoanER leverages a token-based similarity ...
متن کاملK-Means Clustering Algorithm based on Entity Resolution
Entity resolution is the problem of recognizing which entry in database refers to same cluster. in this we have to run the ER in order to reduce the running time and to obtain good results. This paper investigates how we can reduce the running of ER with minimum amount of work using k-means clustering algorithm. In this, clustering can be done according to the matching of entries. We introduce ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Hans Journal of Data Mining
سال: 2020
ISSN: 2163-145X,2163-1468
DOI: 10.12677/hjdm.2020.101002